Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 426 | 443 |
| Missing cells (%) | 8.0% | 8.3% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
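The missing-cell percentages above are consistent with dividing the missing-cell count by the total number of cells (variables × observations). A minimal sketch of that arithmetic, assuming this is how the figure is derived; the helper name `missing_cell_pct` is hypothetical and not part of any library:

```python
def missing_cell_pct(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Return missing cells as a percentage of all cells in the table."""
    total_cells = n_variables * n_observations
    return 100.0 * missing_cells / total_cells


# Values copied from the Dataset statistics table.
pct_a = missing_cell_pct(426, 12, 446)
pct_b = missing_cell_pct(443, 12, 446)
print(f"Dataset A: {pct_a:.1f}%  Dataset B: {pct_b:.1f}%")
```

The per-variable missing counts reported further down (Age, Cabin, Embarked) also add up to these totals: 81 + 344 + 1 = 426 and 88 + 353 + 2 = 443.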
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
Alerts
| Dataset A | Dataset B | Alert type |
|---|---|---|
| Age has 81 (18.2%) missing values | Age has 88 (19.7%) missing values | Missing |
| Cabin has 344 (77.1%) missing values | Cabin has 353 (79.1%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 303 (67.9%) zeros | SibSp has 300 (67.3%) zeros | Zeros |
| Parch has 337 (75.6%) zeros | Parch has 337 (75.6%) zeros | Zeros |
| Fare has 7 (1.6%) zeros | Fare has 8 (1.8%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-03-26 15:35:46.925469 | 2024-03-26 15:35:50.878274 |
| Analysis finished | 2024-03-26 15:35:50.877118 | 2024-03-26 15:35:54.036519 |
| Duration | 3.95 seconds | 3.16 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
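The Duration row can be checked against the start and finish timestamps. A small sketch using only the standard library; the function name `duration_seconds` is illustrative:

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed time in seconds between two timestamps in the report's format."""
    start = datetime.strptime(started, TIMESTAMP_FORMAT)
    end = datetime.strptime(finished, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


# Timestamps copied from the Reproduction table.
dur_a = duration_seconds("2024-03-26 15:35:46.925469", "2024-03-26 15:35:50.877118")
dur_b = duration_seconds("2024-03-26 15:35:50.878274", "2024-03-26 15:35:54.036519")
print(f"{dur_a:.2f} s, {dur_b:.2f} s")
```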
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 434.59865 | 450.8722 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 2 | 1 |
| Maximum | 891 | 891 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 2 | 1 |
| 5-th percentile | 45.5 | 40 |
| Q1 | 184.5 | 226.25 |
| median | 440 | 459.5 |
| Q3 | 664.75 | 679.75 |
| 95-th percentile | 847.75 | 846 |
| Maximum | 891 | 891 |
| Range | 889 | 890 |
| Interquartile range (IQR) | 480.25 | 453.5 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 264.52706 | 259.52334 |
| Coefficient of variation (CV) | 0.60866976 | 0.57560289 |
| Kurtosis | -1.2862288 | -1.2357739 |
| Mean | 434.59865 | 450.8722 |
| Median Absolute Deviation (MAD) | 239.5 | 229.5 |
| Skewness | 0.046691704 | -0.062391757 |
| Sum | 193831 | 201089 |
| Variance | 69974.564 | 67352.363 |
| Monotonicity | Not monotonic | Not monotonic |
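Several of the statistics above are derived from one another, which makes them easy to cross-check. The sketch below recomputes a few Dataset A figures for PassengerId from the other reported values; it assumes the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1) and works from the rounded numbers printed in the tables, so small rounding differences are expected:

```python
# Reported values for PassengerId, Dataset A.
n = 446
total = 193831
std = 264.52706
q1, q3 = 184.5, 664.75
minimum, maximum = 2, 891

mean = total / n               # reported mean: 434.59865
cv = std / mean                # reported CV: 0.60866976
variance = std ** 2            # reported variance: 69974.564
iqr = q3 - q1                  # reported IQR: 480.25
value_range = maximum - minimum  # reported range: 889

print(mean, cv, variance, iqr, value_range)
```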
| Value | Count | Frequency (%) |
| 733 | 1 | 0.2% |
| 508 | 1 | 0.2% |
| 153 | 1 | 0.2% |
| 653 | 1 | 0.2% |
| 628 | 1 | 0.2% |
| 497 | 1 | 0.2% |
| 488 | 1 | 0.2% |
| 704 | 1 | 0.2% |
| 442 | 1 | 0.2% |
| 280 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 645 | 1 | 0.2% |
| 469 | 1 | 0.2% |
| 779 | 1 | 0.2% |
| 253 | 1 | 0.2% |
| 397 | 1 | 0.2% |
| 318 | 1 | 0.2% |
| 777 | 1 | 0.2% |
| 437 | 1 | 0.2% |
| 643 | 1 | 0.2% |
| 71 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 2 | 1 | |
| 5 | 1 | |
| 11 | 1 | |
| 12 | 1 | |
| 14 | 1 | |
| 15 | 1 | |
| 18 | 1 | |
| 20 | 1 | |
| 21 | 1 | |
| 22 | 1 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 5 | 1 | |
| 8 | 1 | |
| 10 | 1 | |
| 11 | 1 | |
| 12 | 1 | |
| 14 | 1 | |
| 18 | 1 | |
| 20 | 1 |
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 0 | 1 |
| 2nd row | 0 | 1 |
| 3rd row | 0 | 0 |
| 4th row | 0 | 0 |
| 5th row | 0 | 0 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
| 0 | 271 | 60.8% |
| 1 | 175 | 39.2% |
Dataset B
| Value | Count | Frequency (%) |
| 0 | 280 | 62.8% |
| 1 | 166 | 37.2% |
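The frequency of each Survived value follows directly from its count and the 446 observations in each dataset. A minimal sketch, with the helper name `frequencies` chosen for illustration only:

```python
from typing import Dict


def frequencies(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert value counts into percentages of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {value: round(100.0 * count / total, 1) for value, count in counts.items()}


# Counts copied from the Common Values tables for Survived.
freq_a = frequencies({"0": 271, "1": 175})
freq_b = frequencies({"0": 280, "1": 166})
print(freq_a, freq_b)
```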
Length and Common Values plots (Dataset A, Dataset B): not reproduced here.
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 271 | |
| 1 | 175 |
| Value | Count | Frequency (%) |
| 0 | 280 | |
| 1 | 166 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 271 | |
| 1 | 175 |
| Value | Count | Frequency (%) |
| 0 | 280 | |
| 1 | 166 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 271 | |
| 1 | 175 |
| Value | Count | Frequency (%) |
| 0 | 280 | |
| 1 | 166 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 271 | |
| 1 | 175 |
| Value | Count | Frequency (%) |
| 0 | 280 | |
| 1 | 166 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 2 | 3 |
| 2nd row | 2 | 1 |
| 3rd row | 1 | 2 |
| 4th row | 3 | 2 |
| 5th row | 2 | 2 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
| 3 | 241 | 54.0% |
| 1 | 113 | 25.3% |
| 2 | 92 | 20.6% |
Dataset B
| Value | Count | Frequency (%) |
| 3 | 253 | 56.7% |
| 2 | 98 | 22.0% |
| 1 | 95 | 21.3% |
Length and Common Values plots (Dataset A, Dataset B): not reproduced here.
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 241 | |
| 1 | 113 | |
| 2 | 92 | 20.6% |
| Value | Count | Frequency (%) |
| 3 | 253 | |
| 2 | 98 | 22.0% |
| 1 | 95 | 21.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 241 | |
| 1 | 113 | |
| 2 | 92 | 20.6% |
| Value | Count | Frequency (%) |
| 3 | 253 | |
| 2 | 98 | 22.0% |
| 1 | 95 | 21.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 241 | |
| 1 | 113 | |
| 2 | 92 | 20.6% |
| Value | Count | Frequency (%) |
| 3 | 253 | |
| 2 | 98 | 22.0% |
| 1 | 95 | 21.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 241 | |
| 1 | 113 | |
| 2 | 92 | 20.6% |
| Value | Count | Frequency (%) |
| 3 | 253 | |
| 2 | 98 | 22.0% |
| 1 | 95 | 21.3% |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 82 | 67 |
| Median length | 51 | 47 |
| Mean length | 27.215247 | 26.540359 |
| Min length | 13 | 12 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 12138 | 11837 |
| Distinct characters | 60 | 60 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Knight, Mr. Robert J | Baclini, Miss. Eugenie |
| 2nd row | Kirkland, Rev. Charles Leonard | Dodge, Master. Washington |
| 3rd row | Minahan, Dr. William Edward | Campbell, Mr. William |
| 4th row | Dantcheff, Mr. Ristiu | Reeves, Mr. David |
| 5th row | Renouf, Mr. Peter Henry | Collyer, Mr. Harvey |
| Value | Count | Frequency (%) |
| mr | 260 | 14.2% |
| miss | 88 | 4.8% |
| mrs | 68 | 3.7% |
| william | 27 | 1.5% |
| john | 24 | 1.3% |
| master | 19 | 1.0% |
| henry | 18 | 1.0% |
| george | 15 | 0.8% |
| charles | 13 | 0.7% |
| elizabeth | 13 | 0.7% |
| Other values (892) | 1284 |
| Value | Count | Frequency (%) |
| mr | 270 | 15.2% |
| miss | 88 | 4.9% |
| mrs | 59 | 3.3% |
| william | 36 | 2.0% |
| john | 20 | 1.1% |
| master | 19 | 1.1% |
| henry | 15 | 0.8% |
| charles | 13 | 0.7% |
| anna | 12 | 0.7% |
| james | 12 | 0.7% |
| Other values (875) | 1234 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1384 | 11.4% |
| r | 997 | 8.2% |
| e | 870 | 7.2% |
| a | 827 | 6.8% |
| n | 669 | 5.5% |
| i | 653 | 5.4% |
| s | 641 | 5.3% |
| M | 561 | 4.6% |
| l | 533 | 4.4% |
| o | 503 | 4.1% |
| Other values (50) | 4500 |
| Value | Count | Frequency (%) |
| (space) | 1333 | 11.3% |
| r | 983 | 8.3% |
| e | 850 | 7.2% |
| a | 813 | 6.9% |
| n | 679 | 5.7% |
| i | 669 | 5.7% |
| s | 637 | 5.4% |
| M | 554 | 4.7% |
| l | 527 | 4.5% |
| o | 476 | 4.0% |
| Other values (50) | 4316 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 12138 |
| Value | Count | Frequency (%) |
| (unknown) | 11837 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1384 | 11.4% |
| r | 997 | 8.2% |
| e | 870 | 7.2% |
| a | 827 | 6.8% |
| n | 669 | 5.5% |
| i | 653 | 5.4% |
| s | 641 | 5.3% |
| M | 561 | 4.6% |
| l | 533 | 4.4% |
| o | 503 | 4.1% |
| Other values (50) | 4500 |
| Value | Count | Frequency (%) |
| (space) | 1333 | 11.3% |
| r | 983 | 8.3% |
| e | 850 | 7.2% |
| a | 813 | 6.9% |
| n | 679 | 5.7% |
| i | 669 | 5.7% |
| s | 637 | 5.4% |
| M | 554 | 4.7% |
| l | 527 | 4.5% |
| o | 476 | 4.0% |
| Other values (50) | 4316 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 12138 |
| Value | Count | Frequency (%) |
| (unknown) | 11837 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1384 | 11.4% |
| r | 997 | 8.2% |
| e | 870 | 7.2% |
| a | 827 | 6.8% |
| n | 669 | 5.5% |
| i | 653 | 5.4% |
| s | 641 | 5.3% |
| M | 561 | 4.6% |
| l | 533 | 4.4% |
| o | 503 | 4.1% |
| Other values (50) | 4500 |
| Value | Count | Frequency (%) |
| (space) | 1333 | 11.3% |
| r | 983 | 8.3% |
| e | 850 | 7.2% |
| a | 813 | 6.9% |
| n | 679 | 5.7% |
| i | 669 | 5.7% |
| s | 637 | 5.4% |
| M | 554 | 4.7% |
| l | 527 | 4.5% |
| o | 476 | 4.0% |
| Other values (50) | 4316 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 12138 |
| Value | Count | Frequency (%) |
| (unknown) | 11837 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1384 | 11.4% |
| r | 997 | 8.2% |
| e | 870 | 7.2% |
| a | 827 | 6.8% |
| n | 669 | 5.5% |
| i | 653 | 5.4% |
| s | 641 | 5.3% |
| M | 561 | 4.6% |
| l | 533 | 4.4% |
| o | 503 | 4.1% |
| Other values (50) | 4500 |
| Value | Count | Frequency (%) |
| (space) | 1333 | 11.3% |
| r | 983 | 8.3% |
| e | 850 | 7.2% |
| a | 813 | 6.9% |
| n | 679 | 5.7% |
| i | 669 | 5.7% |
| s | 637 | 5.4% |
| M | 554 | 4.7% |
| l | 527 | 4.5% |
| o | 476 | 4.0% |
| Other values (50) | 4316 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.7040359 | 4.6681614 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2098 | 2082 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | male | female |
| 2nd row | male | male |
| 3rd row | male | male |
| 4th row | male | male |
| 5th row | male | male |
Common Values
Dataset A
| Value | Count | Frequency (%) |
| male | 289 | 64.8% |
| female | 157 | 35.2% |
Dataset B
| Value | Count | Frequency (%) |
| male | 297 | 66.6% |
| female | 149 | 33.4% |
Length and Common Values plots (Dataset A, Dataset B): not reproduced here.
Most occurring characters
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
| Value | Count | Frequency (%) |
| e | 595 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 149 | 7.2% |
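The character counts, total characters, and mean length reported for Sex can all be recovered from the value counts of `male` and `female`. A sketch under that assumption; the helper name `char_stats` is hypothetical:

```python
from collections import Counter


def char_stats(value_counts):
    """Return (per-character counts, total characters, mean string length)."""
    chars = Counter()
    for value, count in value_counts.items():
        for ch in value:
            # Each occurrence of the value contributes each of its characters once.
            chars[ch] += count
    total_chars = sum(chars.values())
    n_values = sum(value_counts.values())
    return chars, total_chars, total_chars / n_values


# Value counts copied from the Common Values tables for Sex.
chars_a, total_a, mean_len_a = char_stats({"male": 289, "female": 157})
chars_b, total_b, mean_len_b = char_stats({"male": 297, "female": 149})
print(chars_a.most_common(), total_a, mean_len_a)
```

The letter `e` appears once in `male` and twice in `female`, which is why it is the most frequent character even though `m`, `a`, and `l` occur in every row.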
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2098 |
| Value | Count | Frequency (%) |
| (unknown) | 2082 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
| Value | Count | Frequency (%) |
| e | 595 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 149 | 7.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2098 |
| Value | Count | Frequency (%) |
| (unknown) | 2082 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
| Value | Count | Frequency (%) |
| e | 595 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 149 | 7.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2098 |
| Value | Count | Frequency (%) |
| (unknown) | 2082 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
| Value | Count | Frequency (%) |
| e | 595 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 149 | 7.2% |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 73 | 76 |
| Distinct (%) | 20.0% | 21.2% |
| Missing | 81 | 88 |
| Missing (%) | 18.2% | 19.7% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.671233 | 30.378492 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.75 | 0.75 |
| Maximum | 80 | 80 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.75 | 0.75 |
| 5-th percentile | 5 | 4 |
| Q1 | 21 | 21 |
| median | 29 | 29 |
| Q3 | 37 | 39 |
| 95-th percentile | 55.4 | 59.15 |
| Maximum | 80 | 80 |
| Range | 79.25 | 79.25 |
| Interquartile range (IQR) | 16 | 18 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.17264 | 15.048925 |
| Coefficient of variation (CV) | 0.47765592 | 0.49538091 |
| Kurtosis | 0.42499776 | 0.062886566 |
| Mean | 29.671233 | 30.378492 |
| Median Absolute Deviation (MAD) | 8 | 9 |
| Skewness | 0.43737603 | 0.39996082 |
| Sum | 10830 | 10875.5 |
| Variance | 200.86372 | 226.47014 |
| Monotonicity | Not monotonic | Not monotonic |
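For a variable with missing values, the reported figures use two different denominators. The numbers for Age are consistent with Missing (%) being taken over all 446 rows, while Distinct (%) and Mean are taken over the non-missing rows only. A short check of that reading, with the helper name `age_summary` chosen for illustration:

```python
def age_summary(n_rows: int, missing: int, distinct: int, total: float):
    """Recompute Missing (%), Distinct (%) and Mean from the reported counts."""
    n_valid = n_rows - missing               # rows with a recorded age
    missing_pct = 100.0 * missing / n_rows   # share of all rows
    distinct_pct = 100.0 * distinct / n_valid  # share of non-missing rows
    mean = total / n_valid                   # sum divided by non-missing rows
    return n_valid, missing_pct, distinct_pct, mean


# Reported values for Age (Missing, Distinct, Sum).
valid_a, miss_a, dist_a, mean_a = age_summary(446, 81, 73, 10830)
valid_b, miss_b, dist_b, mean_b = age_summary(446, 88, 76, 10875.5)
print(valid_a, round(miss_a, 1), round(dist_a, 1), round(mean_a, 6))
print(valid_b, round(miss_b, 1), round(dist_b, 1), round(mean_b, 6))
```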
| Value | Count | Frequency (%) |
| 28 | 18 | 4.0% |
| 24 | 16 | 3.6% |
| 36 | 14 | 3.1% |
| 29 | 13 | 2.9% |
| 30 | 12 | 2.7% |
| 22 | 12 | 2.7% |
| 23 | 12 | 2.7% |
| 35 | 12 | 2.7% |
| 17 | 11 | 2.5% |
| 32 | 10 | 2.2% |
| Other values (63) | 235 | |
| (Missing) | 81 | 18.2% |
| Value | Count | Frequency (%) |
| 24 | 17 | 3.8% |
| 21 | 16 | 3.6% |
| 28 | 13 | 2.9% |
| 36 | 12 | 2.7% |
| 29 | 12 | 2.7% |
| 18 | 12 | 2.7% |
| 22 | 12 | 2.7% |
| 19 | 12 | 2.7% |
| 32 | 11 | 2.5% |
| 27 | 11 | 2.5% |
| Other values (66) | 230 | |
| (Missing) | 88 | 19.7% |
| Value | Count | Frequency (%) |
| 0.75 | 1 | 0.2% |
| 0.83 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 3 | |
| 2 | 3 | |
| 3 | 3 | |
| 4 | 6 | |
| 5 | 2 | 0.4% |
| 7 | 3 | |
| 8 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0.75 | 2 | 0.4% |
| 1 | 2 | 0.4% |
| 2 | 6 | |
| 3 | 2 | 0.4% |
| 4 | 7 | |
| 5 | 2 | 0.4% |
| 6 | 3 | |
| 7 | 3 | |
| 8 | 3 | |
| 9 | 3 |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.58520179 | 0.57623318 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 303 | 300 |
| Zeros (%) | 67.9% | 67.3% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 3 | 3 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 1.2436466 | 1.2094275 |
| Coefficient of variation (CV) | 2.1251586 | 2.0988508 |
| Kurtosis | 14.765281 | 16.092636 |
| Mean | 0.58520179 | 0.57623318 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.4728816 | 3.5759277 |
| Sum | 261 | 257 |
| Variance | 1.5466569 | 1.4627148 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 303 | |
| 1 | 98 | 22.0% |
| 2 | 16 | 3.6% |
| 3 | 10 | 2.2% |
| 4 | 9 | 2.0% |
| 8 | 5 | 1.1% |
| 5 | 5 | 1.1% |
| Value | Count | Frequency (%) |
| 0 | 300 | |
| 1 | 102 | 22.9% |
| 2 | 18 | 4.0% |
| 4 | 12 | 2.7% |
| 3 | 7 | 1.6% |
| 8 | 5 | 1.1% |
| 5 | 2 | 0.4% |
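Because SibSp takes only seven distinct values, its Sum, Mean, and Variance can be recomputed from the value counts above. The sketch below does this and shows that the reported variance matches the sample formula with an n − 1 denominator; the function name is illustrative:

```python
def mean_and_sample_variance(value_counts):
    """Return (sum, mean, sample variance) computed from {value: count}."""
    n = sum(value_counts.values())
    total = sum(value * count for value, count in value_counts.items())
    mean = total / n
    squared_dev = sum(count * (value - mean) ** 2 for value, count in value_counts.items())
    return total, mean, squared_dev / (n - 1)  # n - 1: sample variance


# Value counts copied from the SibSp tables.
counts_a = {0: 303, 1: 98, 2: 16, 3: 10, 4: 9, 8: 5, 5: 5}
counts_b = {0: 300, 1: 102, 2: 18, 4: 12, 3: 7, 8: 5, 5: 2}

sum_a, mean_a, var_a = mean_and_sample_variance(counts_a)
sum_b, mean_b, var_b = mean_and_sample_variance(counts_b)
print(sum_a, mean_a, var_a)
print(sum_b, mean_b, var_b)
```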
| Value | Count | Frequency (%) |
| 0 | 303 | |
| 1 | 98 | 22.0% |
| 2 | 16 | 3.6% |
| 3 | 10 | 2.2% |
| 4 | 9 | 2.0% |
| 5 | 5 | 1.1% |
| 8 | 5 | 1.1% |
| Value | Count | Frequency (%) |
| 0 | 300 | |
| 1 | 102 | 22.9% |
| 2 | 18 | 4.0% |
| 3 | 7 | 1.6% |
| 4 | 12 | 2.7% |
| 5 | 2 | 0.4% |
| 8 | 5 | 1.1% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.40134529 | 0.38789238 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 6 | 6 |
| Zeros | 337 | 337 |
| Zeros (%) | 75.6% | 75.6% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 0 | 0 |
| 95-th percentile | 2 | 2 |
| Maximum | 6 | 6 |
| Range | 6 | 6 |
| Interquartile range (IQR) | 0 | 0 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.8415598 | 0.82909809 |
| Coefficient of variation (CV) | 2.0968473 | 2.1374436 |
| Kurtosis | 10.294673 | 11.679163 |
| Mean | 0.40134529 | 0.38789238 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.7901656 | 2.9745314 |
| Sum | 179 | 173 |
| Variance | 0.70822291 | 0.68740364 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 337 | |
| 1 | 57 | 12.8% |
| 2 | 44 | 9.9% |
| 5 | 3 | 0.7% |
| 3 | 3 | 0.7% |
| 4 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 337 | |
| 1 | 64 | 14.3% |
| 2 | 37 | 8.3% |
| 5 | 3 | 0.7% |
| 4 | 2 | 0.4% |
| 3 | 2 | 0.4% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 337 | |
| 1 | 57 | 12.8% |
| 2 | 44 | 9.9% |
| 3 | 3 | 0.7% |
| 4 | 1 | 0.2% |
| 5 | 3 | 0.7% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 337 | |
| 1 | 64 | 14.3% |
| 2 | 37 | 8.3% |
| 3 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 5 | 3 | 0.7% |
| 6 | 1 | 0.2% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 388 | 380 |
| Distinct (%) | 87.0% | 85.2% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.5717489 | 6.8654709 |
| Min length | 4 | 3 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2931 | 3062 |
| Distinct characters | 31 | 31 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 347 | 335 |
| Unique (%) | 77.8% | 75.1% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 239855 | 2666 |
| 2nd row | 219533 | 33638 |
| 3rd row | 19928 | 239853 |
| 4th row | 349203 | C.A. 17248 |
| 5th row | 31027 | C.A. 31921 |
| Value | Count | Frequency (%) |
| pc | 36 | 6.4% |
| ca | 12 | 2.1% |
| c.a | 9 | 1.6% |
| a/5 | 8 | 1.4% |
| 2144 | 6 | 1.1% |
| sc/paris | 6 | 1.1% |
| 2343 | 5 | 0.9% |
| 1601 | 5 | 0.9% |
| w./c | 5 | 0.9% |
| 347082 | 4 | 0.7% |
| Other values (406) | 464 |
| Value | Count | Frequency (%) |
| pc | 23 | 4.0% |
| c.a | 12 | 2.1% |
| ca | 8 | 1.4% |
| 2 | 8 | 1.4% |
| ston/o | 8 | 1.4% |
| a/5 | 8 | 1.4% |
| sc/paris | 6 | 1.1% |
| 2343 | 5 | 0.9% |
| f.c.c | 5 | 0.9% |
| w./c | 4 | 0.7% |
| Other values (401) | 483 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 367 | |
| 1 | 339 | |
| 2 | 287 | |
| 7 | 259 | |
| 4 | 252 | |
| 0 | 202 | 6.9% |
| 5 | 198 | 6.8% |
| 6 | 195 | 6.7% |
| 9 | 151 | 5.2% |
| 8 | 143 | 4.9% |
| Other values (21) | 538 |
| Value | Count | Frequency (%) |
| 3 | 394 | |
| 1 | 348 | |
| 2 | 289 | |
| 7 | 244 | 8.0% |
| 4 | 241 | 7.9% |
| 6 | 202 | 6.6% |
| 0 | 201 | 6.6% |
| 5 | 197 | 6.4% |
| 9 | 167 | 5.5% |
| 8 | 136 | 4.4% |
| Other values (21) | 643 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2931 |
| Value | Count | Frequency (%) |
| (unknown) | 3062 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 367 | |
| 1 | 339 | |
| 2 | 287 | |
| 7 | 259 | |
| 4 | 252 | |
| 0 | 202 | 6.9% |
| 5 | 198 | 6.8% |
| 6 | 195 | 6.7% |
| 9 | 151 | 5.2% |
| 8 | 143 | 4.9% |
| Other values (21) | 538 |
| Value | Count | Frequency (%) |
| 3 | 394 | |
| 1 | 348 | |
| 2 | 289 | |
| 7 | 244 | 8.0% |
| 4 | 241 | 7.9% |
| 6 | 202 | 6.6% |
| 0 | 201 | 6.6% |
| 5 | 197 | 6.4% |
| 9 | 167 | 5.5% |
| 8 | 136 | 4.4% |
| Other values (21) | 643 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2931 |
| Value | Count | Frequency (%) |
| (unknown) | 3062 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 367 | |
| 1 | 339 | |
| 2 | 287 | |
| 7 | 259 | |
| 4 | 252 | |
| 0 | 202 | 6.9% |
| 5 | 198 | 6.8% |
| 6 | 195 | 6.7% |
| 9 | 151 | 5.2% |
| 8 | 143 | 4.9% |
| Other values (21) | 538 |
| Value | Count | Frequency (%) |
| 3 | 394 | |
| 1 | 348 | |
| 2 | 289 | |
| 7 | 244 | 8.0% |
| 4 | 241 | 7.9% |
| 6 | 202 | 6.6% |
| 0 | 201 | 6.6% |
| 5 | 197 | 6.4% |
| 9 | 167 | 5.5% |
| 8 | 136 | 4.4% |
| Other values (21) | 643 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2931 |
| Value | Count | Frequency (%) |
| (unknown) | 3062 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 367 | |
| 1 | 339 | |
| 2 | 287 | |
| 7 | 259 | |
| 4 | 252 | |
| 0 | 202 | 6.9% |
| 5 | 198 | 6.8% |
| 6 | 195 | 6.7% |
| 9 | 151 | 5.2% |
| 8 | 143 | 4.9% |
| Other values (21) | 538 |
| Value | Count | Frequency (%) |
| 3 | 394 | |
| 1 | 348 | |
| 2 | 289 | |
| 7 | 244 | 8.0% |
| 4 | 241 | 7.9% |
| 6 | 202 | 6.6% |
| 0 | 201 | 6.6% |
| 5 | 197 | 6.4% |
| 9 | 167 | 5.5% |
| 8 | 136 | 4.4% |
| Other values (21) | 643 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 193 | 178 |
| Distinct (%) | 43.3% | 39.9% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 32.048214 | 30.853717 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 7 | 8 |
| Zeros (%) | 1.6% | 1.8% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.225 | 7.162525 |
| Q1 | 7.925 | 7.8958 |
| median | 15.0229 | 13.5 |
| Q3 | 32.8302 | 30.0708 |
| 95-th percentile | 110.8833 | 91.0792 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 24.9052 | 22.175 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 46.036861 | 50.011745 |
| Coefficient of variation (CV) | 1.4364876 | 1.620931 |
| Kurtosis | 32.473246 | 41.650934 |
| Mean | 32.048214 | 30.853717 |
| Median Absolute Deviation (MAD) | 7.7729 | 6.25 |
| Skewness | 4.5897746 | 5.457171 |
| Sum | 14293.504 | 13760.758 |
| Variance | 2119.3926 | 2501.1747 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 13 | 23 | 5.2% |
| 7.8958 | 19 | 4.3% |
| 8.05 | 19 | 4.3% |
| 26 | 16 | 3.6% |
| 7.75 | 15 | 3.4% |
| 10.5 | 9 | 2.0% |
| 26.55 | 8 | 1.8% |
| 0 | 7 | 1.6% |
| 7.225 | 7 | 1.6% |
| 7.925 | 7 | 1.6% |
| Other values (183) | 316 |
| Value | Count | Frequency (%) |
| 8.05 | 21 | 4.7% |
| 7.75 | 21 | 4.7% |
| 13 | 20 | 4.5% |
| 7.8958 | 18 | 4.0% |
| 26 | 15 | 3.4% |
| 10.5 | 13 | 2.9% |
| 7.925 | 11 | 2.5% |
| 7.25 | 9 | 2.0% |
| 0 | 8 | 1.8% |
| 26.55 | 7 | 1.6% |
| Other values (168) | 303 |
| Value | Count | Frequency (%) |
| 0 | 7 | |
| 6.75 | 2 | 0.4% |
| 6.8583 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 6.975 | 2 | 0.4% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 3 | |
| 7.0542 | 1 | 0.2% |
| 7.125 | 1 | 0.2% |
| 7.1417 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 8 | |
| 6.2375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 6.975 | 2 | 0.4% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 2 | 0.4% |
| 7.0542 | 2 | 0.4% |
| 7.125 | 3 | 0.7% |
Cabin
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 86 | 77 |
| Distinct (%) | 84.3% | 82.8% |
| Missing | 344 | 353 |
| Missing (%) | 77.1% | 79.1% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 11 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.3431373 | 3.4516129 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 341 | 321 |
| Distinct characters | 18 | 18 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 71 | 62 |
| Unique (%) | 69.6% | 66.7% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | C78 | A34 |
| 2nd row | B20 | C93 |
| 3rd row | D36 | C46 |
| 4th row | B35 | B28 |
| 5th row | C2 | E67 |
| Value | Count | Frequency (%) |
| b96 | 3 | 2.6% |
| b98 | 3 | 2.6% |
| f4 | 2 | 1.8% |
| b35 | 2 | 1.8% |
| b20 | 2 | 1.8% |
| c2 | 2 | 1.8% |
| g6 | 2 | 1.8% |
| g73 | 2 | 1.8% |
| f | 2 | 1.8% |
| b5 | 2 | 1.8% |
| Other values (82) | 92 |
| Value | Count | Frequency (%) |
| g6 | 3 | 2.9% |
| b5 | 2 | 1.9% |
| c68 | 2 | 1.9% |
| b28 | 2 | 1.9% |
| b98 | 2 | 1.9% |
| b96 | 2 | 1.9% |
| f | 2 | 1.9% |
| d35 | 2 | 1.9% |
| b49 | 2 | 1.9% |
| e67 | 2 | 1.9% |
| Other values (77) | 84 |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 37 | |
| 2 | 35 | 10.3% |
| 3 | 32 | 9.4% |
| B | 27 | 7.9% |
| 5 | 24 | 7.0% |
| 0 | 22 | 6.5% |
| 1 | 22 | 6.5% |
| 6 | 20 | 5.9% |
| 8 | 17 | 5.0% |
| 9 | 16 | 4.7% |
| Other values (8) | 89 |
| Value | Count | Frequency (%) |
| 1 | 35 | |
| B | 31 | 9.7% |
| 2 | 26 | 8.1% |
| 6 | 25 | 7.8% |
| 3 | 25 | 7.8% |
| C | 20 | 6.2% |
| E | 20 | 6.2% |
| 5 | 20 | 6.2% |
| 4 | 18 | 5.6% |
| D | 15 | 4.7% |
| Other values (8) | 86 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 341 |
| Value | Count | Frequency (%) |
| (unknown) | 321 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| C | 37 | |
| 2 | 35 | 10.3% |
| 3 | 32 | 9.4% |
| B | 27 | 7.9% |
| 5 | 24 | 7.0% |
| 0 | 22 | 6.5% |
| 1 | 22 | 6.5% |
| 6 | 20 | 5.9% |
| 8 | 17 | 5.0% |
| 9 | 16 | 4.7% |
| Other values (8) | 89 |
| Value | Count | Frequency (%) |
| 1 | 35 | |
| B | 31 | 9.7% |
| 2 | 26 | 8.1% |
| 6 | 25 | 7.8% |
| 3 | 25 | 7.8% |
| C | 20 | 6.2% |
| E | 20 | 6.2% |
| 5 | 20 | 6.2% |
| 4 | 18 | 5.6% |
| D | 15 | 4.7% |
| Other values (8) | 86 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 341 |
| Value | Count | Frequency (%) |
| (unknown) | 321 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| C | 37 | |
| 2 | 35 | 10.3% |
| 3 | 32 | 9.4% |
| B | 27 | 7.9% |
| 5 | 24 | 7.0% |
| 0 | 22 | 6.5% |
| 1 | 22 | 6.5% |
| 6 | 20 | 5.9% |
| 8 | 17 | 5.0% |
| 9 | 16 | 4.7% |
| Other values (8) | 89 |
| Value | Count | Frequency (%) |
| 1 | 35 | |
| B | 31 | 9.7% |
| 2 | 26 | 8.1% |
| 6 | 25 | 7.8% |
| 3 | 25 | 7.8% |
| C | 20 | 6.2% |
| E | 20 | 6.2% |
| 5 | 20 | 6.2% |
| 4 | 18 | 5.6% |
| D | 15 | 4.7% |
| Other values (8) | 86 | 26.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 341 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 321 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| C | 37 | 10.9% |
| 2 | 35 | 10.3% |
| 3 | 32 | 9.4% |
| B | 27 | 7.9% |
| 5 | 24 | 7.0% |
| 0 | 22 | 6.5% |
| 1 | 22 | 6.5% |
| 6 | 20 | 5.9% |
| 8 | 17 | 5.0% |
| 9 | 16 | 4.7% |
| Other values (8) | 89 | 26.1% |
| Value | Count | Frequency (%) |
| 1 | 35 | 10.9% |
| B | 31 | 9.7% |
| 2 | 26 | 8.1% |
| 6 | 25 | 7.8% |
| 3 | 25 | 7.8% |
| C | 20 | 6.2% |
| E | 20 | 6.2% |
| 5 | 20 | 6.2% |
| 4 | 18 | 5.6% |
| D | 15 | 4.7% |
| Other values (8) | 86 | 26.8% |
Embarked
Categorical
| Dataset A | Dataset B | |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 1 | 2 |
| Missing (%) | 0.2% | 0.4% |
| Memory size | 7.0 KiB | 7.0 KiB |
Category frequency plot (S, C, Q) for Dataset A and Dataset B
Length
| Dataset A | Dataset B | |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| Dataset A | Dataset B | |
|---|---|---|
| Total characters | 445 | 444 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| Dataset A | Dataset B | |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| Dataset A | Dataset B | |
|---|---|---|
| 1st row | S | C |
| 2nd row | Q | S |
| 3rd row | Q | S |
| 4th row | S | S |
| 5th row | S | S |
Common Values
| Value | Count | Frequency (%) |
| S | 325 | 72.9% |
| C | 82 | 18.4% |
| Q | 38 | 8.5% |
| (Missing) | 1 | 0.2% |
| Value | Count | Frequency (%) |
| S | 329 | 73.8% |
| C | 75 | 16.8% |
| Q | 40 | 9.0% |
| (Missing) | 2 | 0.4% |
Plots: Length; Common Values (Dataset A, Dataset B)
| Value | Count | Frequency (%) |
| s | 325 | 73.0% |
| c | 82 | 18.4% |
| q | 38 | 8.5% |
| Value | Count | Frequency (%) |
| s | 329 | 74.1% |
| c | 75 | 16.9% |
| q | 40 | 9.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 325 | 73.0% |
| C | 82 | 18.4% |
| Q | 38 | 8.5% |
| Value | Count | Frequency (%) |
| S | 329 | 74.1% |
| C | 75 | 16.9% |
| Q | 40 | 9.0% |
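The same count can carry slightly different percentages in different tables: in Dataset B, C is 16.8% under Common Values but 16.9% under Most occurring characters. A minimal sketch of one explanation that is consistent with the reported numbers, assuming Common Values divides by all 446 observations (missing included) while the character tables divide by the 444 non-missing characters. The helper `percent` is hypothetical and is not part of ydata-profiling.

```python
def percent(count, total):
    """Share of `total` as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Dataset B counts for Embarked, copied from the tables above.
counts_b = {"S": 329, "C": 75, "Q": 40}
missing_b = 2

n_characters = sum(counts_b.values())        # 444 one-character values
n_observations = n_characters + missing_b    # 446 rows, missing included

for value, count in counts_b.items():
    # First number: share of all rows; second: share of non-missing characters.
    print(value, percent(count, n_observations), percent(count, n_characters))
```

Under this assumption S comes to 73.8% of observations and 74.1% of characters in Dataset B, and Q rounds to 9.0% either way.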
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 444 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| S | 325 | 73.0% |
| C | 82 | 18.4% |
| Q | 38 | 8.5% |
| Value | Count | Frequency (%) |
| S | 329 | 74.1% |
| C | 75 | 16.9% |
| Q | 40 | 9.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 444 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| S | 325 | 73.0% |
| C | 82 | 18.4% |
| Q | 38 | 8.5% |
| Value | Count | Frequency (%) |
| S | 329 | 74.1% |
| C | 75 | 16.9% |
| Q | 40 | 9.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 444 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| S | 325 | 73.0% |
| C | 82 | 18.4% |
| Q | 38 | 8.5% |
| Value | Count | Frequency (%) |
| S | 329 | 74.1% |
| C | 75 | 16.9% |
| Q | 40 | 9.0% |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 732 | 733 | 0 | 2 | Knight, Mr. Robert J | male | NaN | 0 | 0 | 239855 | 0.0000 | NaN | S |
| 626 | 627 | 0 | 2 | Kirkland, Rev. Charles Leonard | male | 57.0 | 0 | 0 | 219533 | 12.3500 | NaN | Q |
| 245 | 246 | 0 | 1 | Minahan, Dr. William Edward | male | 44.0 | 2 | 0 | 19928 | 90.0000 | C78 | Q |
| 794 | 795 | 0 | 3 | Dantcheff, Mr. Ristiu | male | 25.0 | 0 | 0 | 349203 | 7.8958 | NaN | S |
| 476 | 477 | 0 | 2 | Renouf, Mr. Peter Henry | male | 34.0 | 1 | 0 | 31027 | 21.0000 | NaN | S |
| 690 | 691 | 1 | 1 | Dick, Mr. Albert Adrian | male | 31.0 | 1 | 0 | 17474 | 57.0000 | B20 | S |
| 705 | 706 | 0 | 2 | Morley, Mr. Henry Samuel ("Mr Henry Marshall") | male | 39.0 | 0 | 0 | 250655 | 26.0000 | NaN | S |
| 161 | 162 | 1 | 2 | Watt, Mrs. James (Elizabeth "Bessie" Inglis Milne) | female | 40.0 | 0 | 0 | C.A. 33595 | 15.7500 | NaN | S |
| 795 | 796 | 0 | 2 | Otter, Mr. Richard | male | 39.0 | 0 | 0 | 28213 | 13.0000 | NaN | S |
| 753 | 754 | 0 | 3 | Jonkoff, Mr. Lalio | male | 23.0 | 0 | 0 | 349204 | 7.8958 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 644 | 645 | 1 | 3 | Baclini, Miss. Eugenie | female | 0.75 | 2 | 1 | 2666 | 19.2583 | NaN | C |
| 445 | 446 | 1 | 1 | Dodge, Master. Washington | male | 4.00 | 0 | 2 | 33638 | 81.8583 | A34 | S |
| 466 | 467 | 0 | 2 | Campbell, Mr. William | male | NaN | 0 | 0 | 239853 | 0.0000 | NaN | S |
| 265 | 266 | 0 | 2 | Reeves, Mr. David | male | 36.00 | 0 | 0 | C.A. 17248 | 10.5000 | NaN | S |
| 637 | 638 | 0 | 2 | Collyer, Mr. Harvey | male | 31.00 | 1 | 1 | C.A. 31921 | 26.2500 | NaN | S |
| 237 | 238 | 1 | 2 | Collyer, Miss. Marjorie "Lottie" | female | 8.00 | 0 | 2 | C.A. 31921 | 26.2500 | NaN | S |
| 531 | 532 | 0 | 3 | Toufik, Mr. Nakli | male | NaN | 0 | 0 | 2641 | 7.2292 | NaN | C |
| 603 | 604 | 0 | 3 | Torber, Mr. Ernst William | male | 44.00 | 0 | 0 | 364511 | 8.0500 | NaN | S |
| 816 | 817 | 0 | 3 | Heininen, Miss. Wendla Maria | female | 23.00 | 0 | 0 | STON/O2. 3101290 | 7.9250 | NaN | S |
| 469 | 470 | 1 | 3 | Baclini, Miss. Helene Barbara | female | 0.75 | 2 | 1 | 2666 | 19.2583 | NaN | C |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 43 | 44 | 1 | 2 | Laroche, Miss. Simonne Marie Anne Andree | female | 3.0 | 1 | 2 | SC/Paris 2123 | 41.5792 | NaN | C |
| 272 | 273 | 1 | 2 | Mellinger, Mrs. (Elizabeth Anne Maidment) | female | 41.0 | 0 | 1 | 250644 | 19.5000 | NaN | S |
| 649 | 650 | 1 | 3 | Stanley, Miss. Amy Zillah Elsie | female | 23.0 | 0 | 0 | CA. 2314 | 7.5500 | NaN | S |
| 418 | 419 | 0 | 2 | Matthews, Mr. William John | male | 30.0 | 0 | 0 | 28228 | 13.0000 | NaN | S |
| 211 | 212 | 1 | 2 | Cameron, Miss. Clear Annie | female | 35.0 | 0 | 0 | F.C.C. 13528 | 21.0000 | NaN | S |
| 660 | 661 | 1 | 1 | Frauenthal, Dr. Henry William | male | 50.0 | 2 | 0 | PC 17611 | 133.6500 | NaN | S |
| 177 | 178 | 0 | 1 | Isham, Miss. Ann Elizabeth | female | 50.0 | 0 | 0 | PC 17595 | 28.7125 | C49 | C |
| 781 | 782 | 1 | 1 | Dick, Mrs. Albert Adrian (Vera Gillespie) | female | 17.0 | 1 | 0 | 17474 | 57.0000 | B20 | S |
| 121 | 122 | 0 | 3 | Moore, Mr. Leonard Charles | male | NaN | 0 | 0 | A4. 54510 | 8.0500 | NaN | S |
| 625 | 626 | 0 | 1 | Sutton, Mr. Frederick | male | 61.0 | 0 | 0 | 36963 | 32.3208 | D50 | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 604 | 605 | 1 | 1 | Homer, Mr. Harry ("Mr E Haven") | male | 35.0 | 0 | 0 | 111426 | 26.5500 | NaN | C |
| 231 | 232 | 0 | 3 | Larsson, Mr. Bengt Edvin | male | 29.0 | 0 | 0 | 347067 | 7.7750 | NaN | S |
| 553 | 554 | 1 | 3 | Leeni, Mr. Fahim ("Philip Zenni") | male | 22.0 | 0 | 0 | 2620 | 7.2250 | NaN | C |
| 774 | 775 | 1 | 2 | Hocking, Mrs. Elizabeth (Eliza Needs) | female | 54.0 | 1 | 3 | 29105 | 23.0000 | NaN | S |
| 570 | 571 | 1 | 2 | Harris, Mr. George | male | 62.0 | 0 | 0 | S.W./PP 752 | 10.5000 | NaN | S |
| 126 | 127 | 0 | 3 | McMahon, Mr. Martin | male | NaN | 0 | 0 | 370372 | 7.7500 | NaN | Q |
| 150 | 151 | 0 | 2 | Bateman, Rev. Robert James | male | 51.0 | 0 | 0 | S.O.P. 1166 | 12.5250 | NaN | S |
| 573 | 574 | 1 | 3 | Kelly, Miss. Mary | female | NaN | 0 | 0 | 14312 | 7.7500 | NaN | Q |
| 347 | 348 | 1 | 3 | Davison, Mrs. Thomas Henry (Mary E Finck) | female | NaN | 1 | 0 | 386525 | 16.1000 | NaN | S |
| 866 | 867 | 1 | 2 | Duran y More, Miss. Asuncion | female | 27.0 | 1 | 0 | SC/PARIS 2149 | 13.8583 | NaN | C |
Dataset A
Dataset does not contain duplicate rows.
Dataset B
Dataset does not contain duplicate rows.